Drosophila CTCF tandemly aligns with other insulator
proteins at the borders of H3K27me3 domains
Several multiprotein-DNA complexes capable of insulator activity have been identified in Drosophila melanogaster, yet only
CTCF, a highly conserved zinc finger protein, and the transcription factor TFIIIC have been shown to function in mammals. 
CTCF is involved in diverse nuclear activities, and recent studies suggest that the proteins with which it associates and the 
DNA sequences that it targets may underlie these various roles. Here we show that the Drosophila homolog of CTCF 
(dCTCF) aligns in the genome with other Drosophila insulator proteins such as Suppressor of Hairy wing [SU(HW)] and 
Boundary Element Associated Factor of 32 kDa (BEAF-32) at the borders of H3K27me3 domains, which are also enriched 
for associated insulator proteins and additional cofactors. RNAi depletion of dCTCF and combinatorial knockdown of 
gene expression for other Drosophila insulator proteins leads to a reduction in H3K27me3 levels within repressed domains, 
suggesting that insulators are important for the maintenance of appropriate repressive chromatin structure in Polycomb 
(Pc) domains. These results provide new insights into the roles of insulators in chromatin domain organization and support
recent models suggesting that insulators underlie interactions important for Pc-mediated repression. We reveal an important
relationship between dCTCF and other Drosophila insulator proteins and speculate that vertebrate CTCF may also
align with other nuclear proteins to accomplish similar functions. 



[Supplemental material is available for this article.] 

Insulators were first characterized as regulatory elements that play 
an important role in establishing proper gene expression in 
eukaryotic cells. Early studies demonstrated the ability of in- 
sulators to act as barriers, preventing the spread of heterochro- 
matin and thereby demarcating chromatin boundaries, as well as 
enhancer-blockers, preventing enhancers from activating nearby 
genes in a direction-dependent manner (Gaszner and Felsenfeld 
2006; Bushey et al. 2008). Insulators have since been shown to be 
multiprotein-DNA complexes that can mediate inter- and intra- 
chromosomal interactions important for facilitating proper gene 
expression at specific loci, and more recently in genome-wide 
chromatin organization (Phillips and Corces 2009). Insulator 
activity in vertebrates requires the essential, highly conserved, 
CCCTC-binding factor CTCF. Recent genome-wide studies have 
effectively mapped both the mammalian CTCF binding sites and 
the chromatin interactions that they facilitate (Kim et al. 2007; 
Handoko et al. 2011). However, how CTCF mediates these interactions
and the nature of the proteins required for functional
insulator activity remain poorly understood.

The CTCF insulator protein contains a highly conserved central
domain encoding 11 zinc fingers, and is ubiquitously expressed
(Klenova et al. 1993). Interestingly, CTCF has been implicated in 
numerous unique nuclear functions in addition to the classical 
enhancer-blocking and barrier activities that define insulators. 
These include X-chromosome inactivation (Chao et al. 2002), nu- 
cleolar stability (Guerrero and Maggert 2011), V(D)J recombination 
(Guo et al. 2011), and global chromatin organization (Kim et al. 
2007; Handoko et al. 2011). The combinatorial use of its 11 zinc 
fingers in binding to discrete DNA target sequences, as well as the 
diverse, context-dependent protein-interaction networks of CTCF,
have been proposed to underlie these numerous roles (Filippova
et al. 1996; Zlatanova and Caiafa 2009; Weth and Renkawitz
2011). Meanwhile, recent studies in both D. melanogaster and humans
have demonstrated that CTCF appears to demarcate physical
chromatin domains (Dixon et al. 2012; Nora et al. 2012; Sexton et al.
2012), including a subset of repressive H3K27me3 domains (Bartkuhn
et al. 2009; Cuddapah et al. 2009). However, the proteins with
which CTCF associates and the purpose for which CTCF demarcates
chromatin boundaries require further exploration.

In Drosophila, several insulator binding proteins have been 
identified and characterized, including the Drosophila homolog 
of CTCF (dCTCF), Boundary element associated factor of 32 kDa 
(BEAF-32), and Suppressor of Hairy wing [SU(HW)] (Gurudatta and 
Corces 2009). These DNA-binding proteins require additional 
proteins for functional insulator activity, including Centrosomal 
protein 190 (CP190) and Modifier of mdg4 [MOD(MDG4)] (Ghosh 
et al. 2001; Pai et al. 2004; Gerasimova et al. 2007). We have 
recently identified the genome-wide binding sites of dCTCF, BEAF-32,
SU(HW), and CP190 with high-resolution ChIP-seq and demonstrated
that recruitment of these proteins is regulated during the
ecdysone response in D. melanogaster (Wood et al. 2011). However,
the functional relationship between these different insulator pro- 
teins remains unknown. 

Here we present a comprehensive map of direct insulator- 
binding sites throughout the Drosophila genome and show that as 
many as 40% of dCTCF sites align tightly with the Drosophila-specific
insulators SU(HW) and BEAF-32. dCTCF sites are enriched
for three similar but distinct DNA motifs, potentially representing 
discrete binding modes throughout the Drosophila genome. Aligned 
insulators are enriched for additional cofactors and commonly oc- 
cur at the borders of H3K27me3 domains, where they are essential 
for maintaining appropriate chromatin structure. Surprisingly, we 
find that disruption of insulators genome wide by knockdown of 
insulator components does not significantly affect the expression
of genes flanking H3K27me3 domains, nor does H3K27me3 spread 
beyond domain borders as one might expect based on classical 
barrier models for insulator function at chromatin boundaries. 
Instead, H3K27me3 is lost within domains, suggesting that chro- 
matin insulators serve an important role in the maintenance of 
silenced chromatin in Polycomb (Pc) domains. Our findings sup- 
port recently proposed models, wherein chromatin insulators are 
involved in mediating long-range interactions important for 
Polycomb (Pc)-mediated repression (Pirrotta and Li 2011). 

Results 

dCTCF sites align with Drosophila-specific insulator proteins
SU(HW) and BEAF-32

Two recent studies independently identified the genome-wide 
binding sites of insulator proteins in Drosophila melanogaster by 
combining chromatin immunoprecipitation with microarray hybridization
(ChIP-chip). Whereas one study demonstrated unique
genome-wide distributions and gene ontologies between dCTCF, 
SU(HW), and BEAF-32 (Bushey et al. 2009), the other observed an 
enrichment of dCTCF and BEAF-32 as colocalizing, and therefore 
split insulators into two main classes: dCTCF/BEAF-32/CP190 
(Class I) and SU(HW) (Class II) (Negre et al. 2010). However, previous
studies have shown that CP190 is an essential component
of both the dCTCF and SU(HW) insulators (Pai et al. 2004;
Gerasimova et al. 2007). The functional implications of insulator
classes and why dCTCF might colocalize with other insulator
proteins require further exploration.

To determine the genome-wide binding sites of Drosophila
insulators with greater accuracy and resolution, we recently
re-mapped dCTCF, SU(HW), BEAF-32, and CP190 sites by combining
chromatin immunoprecipitation with high-throughput
sequencing (ChIP-seq). Here we analyze peaks repeatedly called in
three independent ChIP-seq experiments during the ecdysone response
in Drosophila Kc cells, which are therefore most likely to
represent real, stable insulator binding sites (Wood et al. 2011). We
then determined enriched consensus sequence motifs by MEME-ChIP
(Machanick and Bailey 2011). Results confirm previously
identified position weight matrices for each respective insulator
protein (Ramos et al. 2006; Holohan et al. 2007; Negre et al. 2010;
Supplemental Fig. S1). Given the ability of distant insulator proteins
to interact with each other, it is possible that different insulator
proteins bound to these sites may coprecipitate during the
ChIP procedure, thus appearing to colocalize, when in fact they 
are located in different genomic locations. As a consequence, we 
speculate that genome-wide binding profiles for each insulator 
protein likely contain many indirect binding sites. Therefore, we 
excluded sites that do not contain appropriate target sequences for 
each respective insulator protein (see Methods), thereby providing 
a stringent list of insulator sites that are highly likely to represent 
real, direct binding sites for each protein. 
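
The motif-based filter described above can be illustrated with a minimal sketch. The study's actual criteria are given in its Methods and are not reproduced here, so the following rests on stated assumptions: a peak is kept only if its sequence contains the factor's consensus on either strand, and a regular-expression match against an IUPAC consensus string stands in for a position-weight-matrix scan. The consensus `TOY_CONSENSUS`, the peak names, and all function names are hypothetical and chosen for illustration only.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_motif(seq, consensus):
    """True if the consensus occurs on either strand of seq."""
    pattern = re.compile("".join(IUPAC[base] for base in consensus))
    seq = seq.upper()
    return bool(pattern.search(seq) or pattern.search(reverse_complement(seq)))


def direct_sites(peaks, consensus):
    """Keep only the peaks whose sequence contains the consensus."""
    return [name for name, seq in peaks.items() if has_motif(seq, consensus)]


# Toy data: a made-up consensus and made-up peak sequences (assumptions).
TOY_CONSENSUS = "GCNCTA"
peaks = {
    "peak_1": "TTGCACTAAT",  # matches on the forward strand
    "peak_2": "ATTAGGGCAA",  # matches only on the reverse strand
    "peak_3": "AAAAAAAAAA",  # no match; treated as a likely indirect site
}
print(direct_sites(peaks, TOY_CONSENSUS))
```

In this toy case the third peak is discarded, which mirrors the intent of excluding sites that lack an appropriate target sequence; a real analysis would score each peak against the position weight matrix with a defined threshold.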

Results from this analysis suggest that insulator proteins
indeed often cluster together in the genome, as previously reported
(Negre et al. 2010), and do so while binding their own discrete
target sequence. As many as 40% of all dCTCF sites align with 
SU(HW) or BEAF-32, and as many as 5% of all dCTCF sites tightly 
align with both SU(HW) and BEAF-32 (Fig. 1A,B). Though previous 
studies broke insulators into two or three classes, we find that 
dCTCF aligns with SU(HW) (432; 49% of aligned sites) and/or 
BEAF-32 (572; 64% of aligned sites) at many sites. Given the
number of SU(HW) binding sites throughout the genome [4466
sites with SU(HW) consensus, Fig. 1C], earlier correlation
analyses of colocalization were likely biased by thousands of
independent SU(HW) sites. The high resolution obtained by ChIP-seq
demonstrates that these insulators align tightly, within only
200-300 bp (Fig. 1B), and sequential ChIP for insulator proteins
dCTCF, BEAF-32, and SU(HW) suggests that these proteins colocalize
at these sites in individual cells (Supplemental Fig. S2). In
addition, by removing insulator sites lacking known target sequences,
we demonstrate that each insulator protein binds to its
own target sequence (note the DNA sequences in Fig. 1B), and that
overlap is not a consequence of indirect binding. The alignment of
dCTCF with both SU(HW) and BEAF-32 suggests the possibility of
synergistic cooperation in insulator function.
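
The alignment criterion used here (summits ±150 bp that intersect, as in Fig. 1A) can be sketched as follows. This is an illustrative reading, not the authors' pipeline: two summits are called aligned when their ±150-bp windows overlap, that is, when the summits lie at most 300 bp apart on the same chromosome. The coordinates and function names are invented for the example.

```python
from bisect import bisect_left


def aligned_sites(query, target, half_window=150):
    """Return (chrom, summit) pairs from query whose +/- half_window
    interval intersects the interval of at least one target summit."""
    max_dist = 2 * half_window  # two windows touch when summits are this far apart
    target_sorted = {chrom: sorted(pos) for chrom, pos in target.items()}
    hits = []
    for chrom, positions in query.items():
        candidates = target_sorted.get(chrom, [])
        for pos in positions:
            # First target summit at or beyond the left edge of the search range.
            i = bisect_left(candidates, pos - max_dist)
            if i < len(candidates) and candidates[i] <= pos + max_dist:
                hits.append((chrom, pos))
    return hits


# Toy summit coordinates (assumptions, not data from the study).
dctcf = {"chr3L": [1000, 5000, 9000]}
beaf32 = {"chr3L": [1250, 9400]}
suhw = {"chr3L": [5300, 1100]}

with_beaf = set(aligned_sites(dctcf, beaf32))
with_suhw = set(aligned_sites(dctcf, suhw))
either = with_beaf | with_suhw   # dCTCF aligned with BEAF-32 and/or SU(HW)
both = with_beaf & with_suhw     # dCTCF aligned with both proteins
total = sum(len(positions) for positions in dctcf.values())
print(f"{len(either)}/{total} aligned with either; {len(both)}/{total} with both")
```

Sorting the target summits once and using a binary search keeps the comparison fast for thousands of sites per factor, which matters given the number of SU(HW) sites reported.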

dCTCF sites are enriched for multiple DNA motifs 

In addition to its ability to interact with numerous nuclear pro- 
teins, the versatility of CTCF in genome biology may also be at- 
tributable to its wide range of potential target sequences. However, 
genome-wide analyses of CTCF binding sites have revealed a primarily
enriched core target sequence that is strikingly similar between
invertebrates and vertebrates despite millions of years of
evolution (Holohan et al. 2007; Supplemental Fig. S1). This is not
entirely surprising given that CTCF encodes 11 highly conserved
zinc fingers that confer target specificity. However, early charac- 
terization of CTCF identified its ability to bind to a wide range 
of sequences dependent on its combinatorial use of zinc fingers 
(Filippova et al. 1996; Ohlsson et al. 2001), and recent work has 
identified similar regulatory elements bound by CTCF in the hu- 
man genome (Xie et al. 2007). These data suggest that CTCF may 
bind to unique DNA target sequences not represented in the con- 
served target sequence. 

Motif analysis of dCTCF ChIP-seq data by MEME-ChIP
(Machanick and Bailey 2011) indeed identifies the primary consensus
sequence of dCTCF as previously reported (Fig. 2; Supplemental
Fig. S1). However, the results also indicate strong enrichments
for a strikingly similar but novel secondary consensus
sequence (Fig. 2), also independently obtained using Weeder 1.3 
(Pavesi et al. 2004), suggesting that the variability in target se- 
quence specificity holds true for Drosophila. There is also en- 
richment for an additional motif accounting for <10% of dCTCF 
sites (Fig. 2). Comparison of dCTCF read intensities at these three 
motifs suggests that the highly conserved core consensus (motif 1) 
recruits higher occupancy levels of dCTCF, whereas motifs 2 and 3 
recruit lower but similar occupancy levels (Supplemental Fig. S3). 
This finding is similar to previous reports suggesting that CTCF 
targets different occupancy-based motif classes in vertebrates 
(Essien et al. 2009), and suggests that these unique target sites may 
underlie distinct roles. 

Studies in CP190 mutants demonstrated a dependence of
dCTCF on CP190 for binding to a subset of its DNA-binding sites
on polytene chromosomes (Gerasimova et al. 2007; Mohan et al.
2007). Earlier studies have shown that although CP190 physically
associates with insulator proteins SU(HW) and MOD(MDG4)2.2 
[also referred to as MOD(MDG4)67.2] and is essential for func- 
tional insulator activity, it does not directly bind to insulator se- 
quences present on the gypsy retrotransposon (Pai et al. 2004), and 
therefore likely relies on dCTCF and SU(HW) to associate with 
insulator sites. In support of this notion, recent biochemical 
studies demonstrated that CP190 function at dCTCF, SU(HW), and 
BEAF-32 sites requires its BTB/POZ (protein interaction) domains, 
whereas its zinc fingers were dispensable (Oliver et al. 2010).
Interestingly, we find significant enrichments for CP190 at dCTCF sites
containing motifs 2 or 3 when compared with the primary conserved
consensus (Fig. 2B). Given the dependence of dCTCF on CP190 for
binding to a subset of its sites, we speculate that interactions between
dCTCF and CP190 facilitate its interaction with these low-occupancy
and presumably lower-affinity target sequences. Furthermore, we
find enrichments for the novel secondary sequence at sites where
dCTCF aligns with BEAF-32 and where dCTCF aligns with both
BEAF-32 and SU(HW) (Supplemental Fig. S4), suggesting that
these DNA target sequences exhibit distinct features with respect
to insulator recruitment and alignment.

Figure 1. dCTCF aligns with Drosophila insulator-binding proteins SU(HW) and BEAF-32. (A) Venn
diagram depicting overlap of dCTCF binding sites with those of BEAF-32 and SU(HW). Overlap is
represented as the number of sites (summits ±150 bp) in which dCTCF intersects BEAF-32 and/or SU(HW) and
target sequences are present for each insulator protein, suggesting close alignment (within 150 bp). (B)
Example of a ChIP-seq profile for dCTCF, BEAF-32, SU(HW), and CP190 on chromosome 3L, in which
dCTCF aligns with both BEAF-32 and SU(HW) and each cognate target sequence is present. (C) Number of sites
in which dCTCF, BEAF-32, and SU(HW) contain appropriate target sequences, and percentages of sites in
which dCTCF, BEAF-32, and SU(HW) align closely with other DNA-binding insulator proteins. Most (90%)
alignments include dCTCF, and as many as 40% of dCTCF sites align with BEAF-32 and/or SU(HW).

dCTCF recruits unique MOD(MDG4) isoform(s)

The recruitment of CP190 is of particular interest given its ability to
form stable homodimers and homotetramers in vitro, supporting
the notion that active insulators function through loop formation
via interactions with other insulators. CP190 contains a unique
BTB domain that excludes it from the ttk (tramtrack) group of BTB/ 
POZ proteins and inhibits it from interacting with ttk members 
(Bonchuk et al. 2011). However, SU(HW) was first characterized as 
recruiting an additional BTB/POZ protein, MOD(MDG4), also essential
for insulator activity (Gerasimova et al. 1995; Ghosh et al.
2001). MOD(MDG4), in fact, belongs to
the ttk group of BTB/POZ-containing
proteins, and has been shown to form 
higher-order homo-oligomers (Bonchuk 
et al. 2011). Meanwhile, the observation 
that in diploid cell nuclei insulators form 
large structures called insulator bodies 
suggests that many insulators associate 
together within the nucleus, and the 
presence of CP190 and MOD(MDG4) sup- 
ports this possibility (Ghosh et al. 2001; 
Gerasimova et al. 2007). 

Whereas CP190 has been shown
to associate with dCTCF, SU(HW), and
BEAF-32 (Bushey et al. 2009), currently
only the SU(HW) insulator has been
shown to recruit MOD(MDG4). Although
the mod(mdg4) gene encodes at least
26 alternatively spliced variants, each
containing a common N-terminal region
encoding the ttk-family BTB/POZ domain
(Dorn and Krauss 2003; Labrador
and Corces 2003), SU(HW) insulator
activity requires one specific isoform,
MOD(MDG4)2.2 (Gerasimova et al. 1995).
Staining of Drosophila polytene chromosomes
reveals MOD(MDG4)-specific bands
unaccounted for by MOD(MDG4)2.2
staining alone (Fig. 3A), suggesting that
additional isoforms are recruited to DNA.
Whether dCTCF and BEAF-32 recruit
unique MOD(MDG4) isoforms is currently
unknown. We therefore carried out ChIP-seq
analyses in Drosophila Kc cells using
two different antibodies that recognize
either all MOD(MDG4) isoforms or
MOD(MDG4)2.2.

The binding profile for MOD(MDG4)2.2 is significantly
accounted for at SU(HW) sites (Fig. 3B). Given that
MOD(MDG4)2.2 is required for SU(HW) insulator activity,
the MOD(MDG4)2.2 map may reveal a subset of
active SU(HW) sites throughout the Drosophila genome. Here, we
find that the ChIP-seq profile of MOD(MDG4), which includes
significantly more binding sites than MOD(MDG4)2.2 alone (Fig. 
3B), reveals unique peaks at discrete dCTCF and BEAF-32 sites, 
suggesting that additional isoforms of MOD(MDG4) recruited 
by dCTCF and BEAF-32 must exist (Fig. 3C). Whereas average read 
intensities for MOD(MDG4)2.2 are strongest at SU(HW) sites, 
dCTCF and BEAF-32 sites show an opposite trend, with stronger 
read intensities for MOD(MDG4) (Supplemental Fig. S5). Finally, as 
many as 64% of dCTCF sites and 38% of BEAF-32 sites colocalize 
with an isoform of MOD(MDG4). These data suggest that 
dCTCF and BEAF-32 indeed recruit unique isoforms of 
MOD(MDG4), and that all three Drosophila insulators function 
similarly through the recruitment of BTB domain-containing 
proteins CP190 and MOD(MDG4). 

Aligned dCTCF sites are enriched for CP190, MOD(MDG4)
isoforms, and additional cofactors

The tight alignment of dCTCF with BEAF-32 and SU(HW), combined
with their common insulator function, suggests that insulators
synergize, perhaps to create a more active insulator complex.
Given that CP190 and MOD(MDG4) form homo-oligomers, it
is also intuitive to imagine closely aligned insulators cooperatively
recruiting CP190 and MOD(MDG4), increasing the likelihood that
each member of the insulator cluster is functionally active. In
support of this hypothesis, we find enrichment for aligned dCTCF
sites containing CP190 and MOD(MDG4) when compared with
independent dCTCF insulators (Fig. 3D-F). This suggests that by
associating with BEAF-32 and/or SU(HW), dCTCF might ensure
that it will become a functionally active insulator complex by
recruiting essential cofactors.

Figure 2. dCTCF sites are enriched for three distinct DNA motifs, including a similar but novel
secondary motif enriched for insulator protein CP190. (A) Position weight matrices for primary target
sequence and secondary target sequences obtained by MEME-ChIP, and confirmed with Weeder 1.3.
Number of sites provided represents sites in which dCTCF summits are ±150 bp from the DNA motif.
(B) Percentage of dCTCF sites in which CP190 is present when containing each DNA motif.
[Figure panels not reproduced. Legible site counts in panel A: 1,443 and 1,177 (number of dCTCF sites with consensus). Panel B plots a percentage axis (0-100) for Motifs 1, 2, and 3.]

Genome Research, www.genome.org

Many additional proteins have been functionally associated 
with insulators in D. melanogaster, suggesting that these insulator 
clusters may represent hubs for recruiting other cofactors. For
example, Lethal (3) malignant brain tumor [L(3)MBT] has been
recently shown to colocalize with the Drosophila chromatin insulators
dCTCF, BEAF-32, SU(HW), and CP190 (Richter et al. 2011),
and other studies found direct interactions between L(3)MBT and 
the SU(HW) insulator protein (Guruharsha et al. 2011). L(3)MBT 
imparts transcriptional regulation of the Salvador-Warts-Hippo 
(SWH) pathway, likely repressing SWH target genes important for 
cell proliferation and organ size control. Whereas recently pub- 
lished L(3)MBT sites (Richter et al. 2011) are enriched primarily at 
independent dCTCF sites over the BEAF-32 and SU(HW) insulators,
we find enrichment of L(3)MBT sites, identified independently
by ChIP-seq in Drosophila Kc cells, at aligned dCTCF
sites when compared with independent dCTCF sites (Fig. 3D-F).
In addition to SU(HW), L(3)MBT interacts with a chromodomain 
protein, Chromator, that has also recently been shown to colocalize 
and cooperate with the BEAF-32 insulator (Giot et al. 2003; Gan 
et al. 2011). Indeed, using publicly available ChIP-chip data for
Chromator in Drosophila Kc cells (Celniker et al. 2009), aligned 
dCTCF sites also show an apparent enrichment for Chromator 
(Fig. 3D-F). Interestingly, Chromator and zinc finger protein Z4 are 
important for maintaining polytene chromosome structure (Eggert 
et al. 2004), suggesting a functional relationship between insulators 
and Chromator in chromatin domain organization. 



Together, these data suggest that 
dCTCF may team up with Drosophila- 
specific insulator proteins in order to 
more efficiently recruit cofactors essential 
for insulator activity. These insulator sites 
are enriched for additional insulator-re- 
lated proteins L(3)MBT and Chromator, 
suggesting that these sites are different 
from independent insulator sites and ap- 
pear to represent large complexes of pro- 
teins associated with insulator activity. 

Aligned dCTCF sites commonly flank 
the borders of H3K27me3 domains 
The correlation of dCTCF with SU(HW)
and BEAF-32 is striking, but why dCTCF
clusters with other insulator proteins requires
further exploration. Recent interrogation
of chromosome architecture
in D. melanogaster revealed recurrent 
combinations of insulators and active 
histone marks at the borders of physical 
domains, including enrichment for 
Chromator (Sexton et al. 2012). Com- 
parison with physical domains analyzed by Sexton et al. (2012) 
reveals enrichment for aligned dCTCF sites within 5 kb of do- 
main borders, suggesting that these tandemly aligned insulators 
are involved in demarcating chromatin domains (Fig. 4A,B). 
Recent studies in both Drosophila and humans have also dem- 
onstrated an enrichment of CTCF and other insulators at the 
borders of H3K27me3 domains (Bartkuhn et al. 2009; Cuddapah 
et al. 2009; Negre et al. 2010). 

In order to determine whether aligned dCTCF insulator sites 
occur at H3K27me3 domain borders in Drosophila Kc cells, we
independently mapped repressive chromatin domains by ChIP-seq
against H3K27me3. We find an enrichment of insulator proteins
immediately outside of H3K27me3 domain borders, and signifi- 
cant enrichment of aligned dCTCF sites within 5 kb of H3K27me3 
domains (Fig. 4C,D). Interestingly, read intensities for each in- 
sulator protein flanking domain borders (Fig. 4D) suggest a peri- 
odicity of insulator presence beginning with dCTCF, consistent 
with the observation that insulators tandemly align. There is 
no significant enrichment for dCTCF aligned with BEAF-32 vs. 
dCTCF aligned with SU(HW) at domain borders (Supplemental 
Fig. S6), suggesting that dCTCF aligns with BEAF-32 and/or
SU(HW) at the borders of repressed chromatin domains.
However, the functional significance of insulators at chromatin
domain borders and dCTCF alignment remains poorly
characterized.
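
The border-proximity comparison reported above (aligned versus independent dCTCF sites within 5 kb of domain borders) can be sketched as below. This is a hedged illustration under simplifying assumptions: a single chromosome, domains given as (start, end) pairs, and proximity measured from each site to the nearest domain edge. The coordinates and function names are hypothetical, and no statistical test is included.

```python
from bisect import bisect_left


def distance_to_nearest_border(pos, borders):
    """Distance from pos to the closest value in a sorted, non-empty list."""
    i = bisect_left(borders, pos)
    # Only the borders immediately left and right of pos can be the nearest.
    neighbours = borders[max(i - 1, 0): i + 1]
    return min(abs(pos - b) for b in neighbours)


def fraction_near_borders(sites, domains, max_dist=5000):
    """Fraction of sites lying within max_dist bp of any domain start or end."""
    borders = sorted(edge for start, end in domains for edge in (start, end))
    near = sum(distance_to_nearest_border(p, borders) <= max_dist for p in sites)
    return near / len(sites)


# Toy H3K27me3 domains and dCTCF summits on one chromosome (assumptions).
domains = [(100_000, 150_000), (300_000, 340_000)]
aligned_dctcf = [98_000, 151_000, 343_000, 220_000]
independent_dctcf = [125_000, 200_000, 302_000, 500_000]

print("aligned:", fraction_near_borders(aligned_dctcf, domains))
print("independent:", fraction_near_borders(independent_dctcf, domains))
```

A real comparison would also need a background expectation, for example from randomly placed sites, before calling the difference an enrichment.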

RNAi depletion of insulator proteins results in H3K27me3 loss 
within repressed domains 

Previous analyses of H3K27me3 levels immediately flanking domain
borders in dCTCF and CP190 mutants suggest that these sites
functionally maintain chromatin architecture at these domains by 
preventing the spread of heterochromatin (Bartkuhn et al. 2009). 
Given recent data that CTCF associates with various nuclear pro- 
teins in a context-dependent fashion, it is conceivable that dCTCF 
tightly aligns with other insulators to establish a robust barrier 
insulator at the borders of repressive domains to effectively prevent 
the spread of heterochromatin. Therefore, we next sought to
determine whether insulators are important for maintaining appropriate
chromatin architecture and gene expression at these
H3K27me3 domains by combinatorial knockdown of insulator
proteins in Drosophila Kc cells (Supplemental Fig. S7).

Figure 3. dCTCF and BEAF-32 recruit isoform(s) of MOD(MDG4) different from MOD(MDG4)2.2. Aligned dCTCF sites are enriched for MOD(MDG4)
and additional cofactors. (A) Immunofluorescence microscopy of MOD(MDG4) (green) and MOD(MDG4)2.2 (red) on Drosophila polytene chromosomes.
MOD(MDG4) staining includes many discrete bands not accounted for by MOD(MDG4)2.2-specific antibodies, depicted with white arrows. (B)
Genome-wide overlap of dCTCF, BEAF-32, and SU(HW) with MOD(MDG4) and MOD(MDG4)2.2. Many dCTCF (45%) and BEAF-32 (34%) sites overlap
MOD(MDG4) isoform(s) not represented by MOD(MDG4)2.2. Meanwhile, many SU(HW) (37%) sites overlap MOD(MDG4)2.2 sites, as expected. (C)
ChIP-seq profile for MOD(MDG4) and MOD(MDG4)2.2 reveals many unique peaks specifically in the MOD(MDG4) profile, accounted for at BEAF-32 and
dCTCF sites. (D-F) Heatmap representation of percentages of dCTCF sites in which CP190, MOD(MDG4), MOD(MDG4)2.2, BEAF-32, SU(HW), L(3)MBT,
and/or Chromator co-occur at independent dCTCF sites, aligned dCTCF sites, and sites where dCTCF aligns with both BEAF-32 and SU(HW).

Surprisingly, when insulators are disrupted genome wide, we find
no evidence for the down-regulation of domain-flanking genes,
compared with genome-wide averages, that one might expect if
heterochromatin spread beyond domain boundaries (Fig. 5A).
We therefore carried out ChIP-seq for H3K27me3 in Drosophila
Kc cells after dCTCF knockdown. Results revealed decreased
levels of H3K27me3 immediately within domain borders as well as
throughout H3K27me3 domains, but not an increase outside of
domain boundaries (Fig. 5B). H3K27me3 levels were specifically
reduced in Polycomb (Pc) domains containing dCTCF, indicating
that loss of H3K27me3 is a direct effect of dCTCF knockdown
rather than a general consequence of disrupted chromatin
architecture (Supplemental Fig. S8). ChIP-PCR for H3K27me3
at several loci confirms significant loss of H3K27me3 in
response to RNAi depletion of dCTCF, as well as under various
conditions of gene expression knockdown for insulator proteins,
including MOD(MDG4), suggesting that the recruitment of
MOD(MDG4) to dCTCF sites and its enrichment at aligned
insulators are functionally significant (Fig. 6). Importantly, the
expression of the Enhancer of zeste [E(z)] gene, which encodes the
methyltransferase responsible for H3K27me3, is unaffected by
any combination of insulator knockdown, and nuclear levels of
H3K27me3 remain unchanged (Supplemental Fig. S7), indicating
that this is not an indirect consequence of disruption in
methyltransferase activity. The reduction in H3K27me3 levels
suggests that insulators play an active role in the maintenance
of H3K27me3 levels within Pc domains. Despite reduced
H3K27me3 levels within repressive domains, gene expression
is relatively unaffected for genes within these domains after
knockdown of insulator proteins (Supplemental Fig. S8), suggesting
either that the mechanisms underlying gene repression in Pc
domains have not been entirely compromised, or that additional
steps are necessary for the activation of genes within Pc
domains.
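
The distinction drawn above, between H3K27me3 loss inside domains and spreading outside them, amounts to comparing signal changes in two regions. A minimal sketch follows, assuming binned, already normalized ChIP-seq signal for control and knockdown samples. The bin values, domain coordinates, pseudocount, and function names are all illustrative assumptions, not the authors' analysis.

```python
from math import log2
from statistics import mean


def inside_and_flank_means(signal, start, end, flank):
    """Mean signal inside bins [start, end) and in `flank` bins on each side."""
    inside = mean(signal[start:end])
    flanks = signal[max(start - flank, 0):start] + signal[end:end + flank]
    return inside, mean(flanks)


def log2_change(control, knockdown, pseudocount=1.0):
    """log2 ratio of knockdown to control, stabilized with a pseudocount."""
    return log2((knockdown + pseudocount) / (control + pseudocount))


# Toy per-bin H3K27me3 signal; the domain occupies bins 3 to 6 (assumption).
control = [1, 1, 1, 9, 9, 9, 9, 1, 1, 1]
knockdown = [1, 1, 1, 4, 4, 4, 4, 1, 1, 1]

c_in, c_out = inside_and_flank_means(control, 3, 7, flank=3)
k_in, k_out = inside_and_flank_means(knockdown, 3, 7, flank=3)

inside_change = log2_change(c_in, k_in)    # negative: loss within the domain
flank_change = log2_change(c_out, k_out)   # near zero: no spreading outside
print(inside_change, flank_change)
```

A classical barrier model would instead predict a positive change in the flanks after knockdown, so reporting the two values separately makes the contrast between the models explicit.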

The even-skipped gene provides a model for dCTCF alignment 
at H3K27me3 domain borders 

In mammals, broad domains of repressive H3K27me3 character- 
ized by Polycomb have been shown to silence clusters of de- 
velopmentally important genes (Bracken et al. 2006; Pauler et al. 
2009). Recent findings have demonstrated similar repression of
developmental genes in stable, cell-stage independent H3K27me3 
domains in D. melanogaster (Negre et al. 2011). Genes within 
H3K27me3 domains are highly enriched for developmental genes 
in Kc cells, including the even-skipped (eve) gene (Supplemental 
Table S1), consistent with previous results. The eve gene thus provides
an excellent model to analyze the role of tandemly aligned
dCTCF sites and chromatin organization. 

eve is an early pair-rule gene encoding a homeodomain- 
containing transcription factor involved in segmentation 
(Macdonald et al. 1986). Expression of eve peaks within the first 6 h
of embryogenesis, and eve is essentially not expressed in late
embryonic/adult Drosophila tissues (Gelbart and Emmert 2010), including
late-embryonic Drosophila Kc cells (Celniker et al. 2009). eve is one 
of several hundred genes targeted by Polycomb (Pc), and recent 
data suggest that Pleiohomeotic, a Pc DNA-binding protein, 
negatively regulates eve during embryogenesis (Kwong et al. 
2008; Kim et al. 2011). Analysis of the eve locus in Drosophila Kc 
cells reveals H3K27me3-mediated repression in the form of a 15-kb
H3K27me3 domain (Supplemental Fig. S9). The domain is
flanked immediately downstream by dCTCF aligned with both
BEAF-32 and SU(HW), and immediately upstream by a dCTCF site 
aligned with two SU(HW) elements. In both cases, dCTCF binding 
sites are marked by the secondary target sequence identified by 
MEME-ChIP and Weeder 1.3 (Pavesi et al. 2004; Machanick and
Bailey 2011). These aligned dCTCF sites overlap with ChIP-seq
profiles for CP190, MOD(MDG4), L(3)MBT, and Chromator, 
consistent with genome-wide enrichments for insulator-associ- 
ated proteins. Knockdown of insulator proteins has no effect on 
the expression of domain-flanking genes CG12134 and TER94 
(Supplemental Table S2), nor does it significantly affect the expres- 
sion of eve. However, knockdown of dCTCF results in H3K27me3 
depletion within the repressed eve domain (Supplemental Fig. 
S9), and knockdown of additional insulator proteins, including 
MOD(MDG4), produces similar results (Fig. 6A,D). Despite loss of 
insulators and H3K27me3 depletion, eve appears to remain re- 
pressed, suggesting that insulator proteins contribute to appropriate 
chromatin domain structure but are not essential for maintenance 
of gene silencing in these domains. Importantly, this model for
insulator alignment at H3K27me3 domain borders is consistent
throughout the genome, including at the
early-stage developmental gene eyes absent
(eya) and the hybrid sterility gene
Odysseus-site homeobox (OdsH) (Fig. 6B,C,E,F).

Discussion

Improvements in genomic strategies for 
mapping genome-wide interactions have 
allowed recent studies to probe basic ge- 
nome folding principles as well as in- 
sulator-mediated chromatin interactions 
(Lieberman-Aiden et al. 2009; Handoko 
et al. 2011; Yaffe and Tanay 2011; Dixon 
et al. 2012; Nora et al. 2012; Sexton et al. 
2012). Results consistently support cur- 
rent models proposing roles for insulator 
proteins in chromosome organization 
(Phillips and Corces 2009) and challenge 
the basic barrier and enhancer-blocking 
activities that classically defined these 
proteins. Instead, the ability of insulators 
to block the spread of heterochromatin 
and impede enhancer-promoter 
interactions may simply be consequences 
of a more fundamental role in chromosome 
organization. New findings in Drosophila 
also suggest that insulators are required to 
mediate long-range interactions impor- 
tant for Polycomb (Pc) repression (Comet 
et al. 2011; Li et al. 2011), and the recent 
identification of CTCF in transcription 
factories (Melnik et al. 2011) suggests that 
insulators may direct the localization of 
specific genomic loci to discrete nuclear 
subcompartments for gene regulation 
(Pirrotta and Li 2011). 

Nevertheless, our finding that heterochromatin does not spread into 
flanking chromatin domains in response to disruption of insulator 
proteins is surprising based on numerous examples of 
insulator-mediated barrier function. Though individual insulator 
elements may indeed serve to prevent the spread of silencing 
chromatin, our disruption of total insulator protein levels instead 
significantly affected the levels of H3K27me3 within rather than 
outside of repressive chromatin domains. Knockdown of insulator 
proteins had no effect on the expression of E(z) or total H3K27me3 
levels (Supplemental Fig. S7). Therefore, the loss of H3K27me3 
within Pc domains genome wide suggests that insulators play a 
critical role in the maintenance of appropriate chromatin 
architecture at these specific loci. Given the requirement for 
insulators in long-range Pc interactions (Comet et al. 2011; Li 
et al. 2011), we speculate that long-range interactions mediated 
by dCTCF and other Drosophila insulator proteins are ultimately 
disrupted by RNAi depletion of insulator proteins, and that 
H3K27me3 depletion likely reflects a defect in Pc-mediated 
compaction and maintenance of H3K27me3 at developmental 
loci. Interestingly, however, expression of genes within repressive 
H3K27me3 domains was not significantly affected (Supplemental 
Fig. S8), suggesting that Pc-mediated gene silencing was not ab- 
rogated or that additional steps are required to activate these 
developmental genes. Future studies investigating the role of 
insulators in Pc-mediated repression, and the effects of insulator 
disruption in nuclear organization, will provide valuable in- 
sight into the relationship between insulator proteins and 
chromatin architecture. 

The diverse activities of CTCF in gene expression and chro- 
matin organization require exploration of the proteins with 
which it functions and the target sequences associated with specific 
functions. By combining the resolution conferred by high-throughput 
sequencing (ChIP-seq) with mapping of core target 
sequences, we provide a stringent but exhaustive map of direct 
binding sites for Drosophila insulators and extend our previous 
analyses of dCTCF, SU(HW), BEAF-32, and CP190 to include the 
insulator protein MOD(MDG4). We show that dCTCF aligns with 
both the SU(HW) and BEAF-32 insulators, where dCTCF becomes 
enriched for additional insulator and insulator-associated pro- 
teins. The presence of aligned dCTCF sites at the borders of 
H3K27me3 domains provides an excellent system to query the 
importance of insulator proteins at the boundaries of discrete 
chromatin domains. Recently identified correlations for in- 
sulator proteins at the boundaries of physical domains mapped 
in Drosophila melanogaster (Sexton et al. 2012) provide evidence 
for why only a subset of aligned dCTCF localize to H3K27me3 
domain borders (Fig. 4), and clearly demonstrate that insulators are 
also involved in the organization of other, distinct chromatin 
domains. Whereas Pc-repressed domains are relatively easily 
identifiable in the form of H3K27me3 signatures, future charac- 
terization of discrete physical domains and domain boundaries will 
require genome-wide interrogation of chromosome interactions in 
individual cell types of interest. Nearly 40% of aligned dCTCF sites 
(~355) localize to physical domain boundaries mapped in late 
embryos by Sexton et al. (2012), suggesting that physical domains 
and insulator localization may be conserved at many loci across 
cell types. 

Interestingly, dCTCF appears to target three different sequences 
in D. melanogaster, including the highly conserved core motif that 
dCTCF has been described to bind in both Drosophila and mammals 
(Holohan et al. 2007). The secondary 
motif appears highly similar to the conserved core consensus 
(AGGNGGC) with an insertion between the first pair of guanines 
(AGTGTGGC), and average dCTCF levels suggest that this repre- 
sents a low occupancy and potentially lower-affinity binding site. 
These novel dCTCF sites are highly enriched for insulator protein 
CP190 when compared with sites containing the primary target sequence. 
This finding, combined with previous data indicating that CP190 is 
essential for dCTCF binding to a subset of its target sites, suggests 
that CP190 might facilitate dCTCF binding to these secondary 
sites. The absence of CP190 in vertebrates may explain why these 
sequences have not been identified as mammalian target sequences, 
raising the possibility that these binding sites are a Drosophila- 
specific phenomenon. 

Analysis of dCTCF insulator alignment at the eve locus and 
genome wide uncovers a tight association with BEAF-32 and 
SU(HW), which may provide dCTCF with numerous advantages 
for effectively establishing a functional insulator. First, alignment 
of multiple insulator DNA elements may increase the likelihood of 
sequence accessibility at important loci, as insulator-binding sites 
have been characterized by reduced nucleosome density (Negre 
et al. 2010). For example, an insulator-binding protein may access 
its cognate sequence, thereby creating an accessible landscape 
for other, potentially different insulator proteins to bind their re- 
spective targets. Second, by aligning in close proximity, recruitment 
of essential insulator proteins [i.e., CP190 and MOD(MDG4)] by 
one insulator-binding protein may facilitate recruitment by others, 
given that CP190 and MOD(MDG4) may be recruited as multimers. 
Third, given that dCTCF binds secondary sites that potentially 
require CP190, recruitment of CP190 by a neighboring 
insulator [i.e., SU(HW) or BEAF-32] may preclude dCTCF binding, 
thereby providing a regulatory step in dCTCF recruitment to DNA. 
Finally, by aligning with SU(HW) and BEAF-32, dCTCF establishes 
a unique identity compared with independent dCTCF sites, where 
it becomes enriched for additional cofactors, including L(3)MBT 
and Chromator (Fig. 7). 

Though our data provide new and valuable insight into 
what appears to be cooperative insulator function in Drosophila 
melanogaster, many questions remain. Given current models that 
insulators function via intra- and interchromosomal interactions, 
it is plausible that aligned dCTCF sites and their enrichment 
for CP190 and MOD(MDG4) allow for stable chromosomal in- 
teractions. Current locus- and genome-wide interaction assays 
may effectively answer this question in the near future. While 
BEAF-32 has been defined as lineage specific (Schoborg and 
Labrador 2010), and SU(HW) appears to lack a counterpart in 
mammals, our results suggest that mammalian CTCF may align 
with other, unique DNA-binding proteins important for appro- 
priate insulator function at the boundaries of Pc domains. 

Methods 

ChIP-seq 

Chromatin immunoprecipitation was performed as previously 
described (Bushey et al. 2009). For Re-ChIP assays, chromatin was 
eluted in 1% SDS, 0.1 M NaHCO3, diluted 10-fold in IP dilution 
buffer (0.01% SDS, 1.1% Triton X-100, 1.2 mM EDTA, 16.7 mM 
Tris-HCl, 167 mM NaCl), and ChIP repeated using antibodies 
against BEAF-32 or SU(HW). ChIP of MOD(MDG4) was carried 
out with antibodies against the mod2.2 isoform (α-Rabbit; gift 
from Elissa Lei [NIDDK, NIH]) and against the region shared by all 
isoforms as previously described (Bushey et al. 2009). ChIP for 
L(3)MBT in Drosophila Kc cells was carried out using a previously 
described antibody (α-Guinea-pig; gift from Jurgen Knoblich) 
(Richter et al. 2011), and ChIP against H3K27me3 was performed 
using a commercially available polyclonal antibody (Millipore 
Cat# 07-449). To generate sequencing libraries, ChIP DNA was 
prepared for adaptor ligation by end repair (End-It DNA End Repair 
Kit, Epicenter Cat# ER0720) and addition of an "A" base to 3′ ends 
(Klenow 3′→5′ exo-, NEB Cat# M0212S). Illumina adaptors (Illumina 
Cat# PE-102-1001) were titrated according to prepared DNA 
ChIP sample concentration and ligated with T4 ligase (NEB Cat# 
M0202S). Ligated ChIP samples were PCR-amplified using Illu- 
mina primers and Phusion DNA polymerase (NEB Cat# F-530L) 
and size selected for 200-300 bp by gel extraction. ChIP libraries 
were sequenced at the HudsonAlpha Institute for Biotechnology, 
using an Illumina HiSeq 2000. Sequences were mapped to the dm3 
genome with Bowtie 0.12.3 (Langmead 2010) using default set- 
tings. Peaks were then called with MACS 1.4.0alpha2 (Zhang et al. 
2008) using equal numbers of unique reads for input and ChIP 
samples and a P-value cutoff of 1 × 10⁻¹⁰. 
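
Peak calling used equal numbers of unique reads for the input and ChIP samples. How the read sets were equalized is not specified here; the sketch below shows one plausible approach, collapsing duplicate alignments and randomly subsampling the larger set. The read representation (chromosome, strand, start), the function names, and the fixed random seed are illustrative assumptions and not part of the published pipeline.

```python
import random


def unique_reads(reads):
    """Collapse reads with identical (chrom, strand, start), keeping first-seen order."""
    return list(dict.fromkeys(reads))


def equalize_unique_reads(chip, control, seed=0):
    """Return equally sized random subsets of the unique ChIP and control reads.

    Assumption: each read is a hashable tuple such as ("2L", "+", 1042).
    """
    chip_u = unique_reads(chip)
    control_u = unique_reads(control)
    n = min(len(chip_u), len(control_u))  # size of the smaller unique set
    rng = random.Random(seed)             # fixed seed for reproducibility
    return rng.sample(chip_u, n), rng.sample(control_u, n)
```

The two returned lists could then be written back to alignment files and passed to a peak caller as treatment and control.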

ChIP-seq and bioinformatics analyses 

Previously published ChIP-seq data are available from GEO accession 
GSE30740 (Wood et al. 2011). DNA sequence motifs 
present in binding sites for dCTCF, BEAF-32, and SU(HW) were 
identified using commonly called peaks from three independent 
biological samples (and thus represent insulator binding sites of 
highest confidence), Drosophila Kc cells treated with ecdysone at 0, 
3, and 48 h (Wood et al. 2011). Primary motifs were identified by 
MEME-ChIP using default settings (Machanick and Bailey 2011). 
dCTCF motif 2 was identified in both MEME-ChIP and Weeder 1.3 
(Pavesi et al. 2004), and motif 3 in MEME-ChIP by excluding peaks 
containing the primary conserved motif. Insulator peaks were then 
trimmed to include only those containing core consensus sequences 
for each protein using ambiguity codes specified by MEME-ChIP. For 
dCTCF this included all three motifs described: AG[GA][TG]GGCGC 
(allowing for one mutation), [AG]GTGT[GT][GA]CC (allowing for one 
mutation), and GGT[TG][TGC][GA][TA][GA][TA]C[TC][TC][CGT]GCTA 
(allowing for one mutation). For BEAF-32, this included the previously 
identified motif [ATG][TGC]CGATA with no mutations allowed, and for 
SU(HW) the motif GC[AC]TA[CT]TTT allowing for one mutation. Direct 
insulator binding sites were thus finally called as summits identified 
by MACS in three independent biological samples ±150 bp that contain 
the described consensus sequence specific for each insulator protein. 
Overlaps between insulators and associated proteins were identified 
using publicly available tools on Galaxy (Giardine et al. 2005; 
Blankenberg et al. 2010; Goecks et al. 2010). 

Figure 7. Diagram comparing independent dCTCF sites and dCTCF 
sites aligned with BEAF-32 and SU(HW). dCTCF, BEAF-32, and SU(HW) 
function similarly through the recruitment of CP190 and MOD(MDG4). 
Aligned dCTCF sites are enriched for the secondary DNA sequence and 
CP190. Ultimately, alignment may allow for cooperative recruitment of 
CP190 and MOD(MDG4), ensuring that dCTCF establishes a functional 
insulator complex at domain borders. Recruitment of additional proteins, 
such as L(3)MBT and Chromator, may also contribute to insulator 
activities at these loci. 
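
The direct binding site definition used in this section (a summit ±150 bp that contains the protein's consensus sequence with a bounded number of mutations) can be sketched as follows. This is a minimal illustration and not the code used for the analysis: the function names are hypothetical, a "mutation" is assumed to mean a single-base substitution, and only the forward strand is scanned.

```python
import re


def parse_motif(motif):
    """Turn a bracketed consensus such as 'GC[AC]TA[CT]TTT' into one set of allowed bases per position."""
    tokens = re.findall(r"\[[ACGT]+\]|[ACGT]", motif)
    return [set(token.strip("[]")) for token in tokens]


def count_mismatches(positions, seq):
    """Number of positions in seq whose base is not allowed by the motif."""
    return sum(base not in allowed for allowed, base in zip(positions, seq))


def has_motif(window, motif, max_mismatches=1):
    """True if any forward-strand k-mer in window matches within max_mismatches substitutions."""
    positions = parse_motif(motif)
    k = len(positions)
    window = window.upper()
    return any(
        count_mismatches(positions, window[i:i + k]) <= max_mismatches
        for i in range(len(window) - k + 1)
    )


def summit_window(chrom_seq, summit, flank=150):
    """Sequence from summit - flank to summit + flank (0-based summit), clipped at the ends."""
    start = max(0, summit - flank)
    return chrom_seq[start:summit + flank + 1]
```

Under these assumptions, a SU(HW) site would be retained when `has_motif(summit_window(seq, summit), "GC[AC]TA[CT]TTT", 1)` is true, and a BEAF-32 site when the call with `"[ATG][TGC]CGATA"` and `max_mismatches=0` is true. A fuller version would also scan the reverse complement.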

H3K27me3 domains were called using H3K27me3 ChIP-seq 
data obtained here in Drosophila Kc cells, with comparison to 
publicly available H3K27me3 ChIP-seq data in late embryos and 
the added requirement for Polycomb occupancy/signal in Dro- 
sophila Kc cells (Celniker et al. 2009). Domain borders were called 
as 0th nucleotide of peaks called, and organized by K-means clus- 
tering by Cluster 3.0 (de Hoon et al. 2004). Genes within H3K27me3 
domains were called as intersecting (>300 bp) H3K27me3 domains 
using publicly available tools on Galaxy (Giardine et al. 2005; 
Blankenberg et al. 2010; Goecks et al. 2010). Comparisons between 
histone H3K27me3 before and after dCTCF knockdown were 
performed after rank order normalization, as recently described 
(Whyte et al. 2012). Briefly, these ChIP-seq data sets are 
rank-ordered in 10-bp bins across the Drosophila genome, from highest 
to lowest read intensity. Averages between the two data sets are 
then assigned to each bin, from highest to lowest read values. 
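
The rank order normalization described above amounts to quantile normalization of two data sets. A minimal sketch follows, assuming each data set is a list of per-bin read counts over the same 10-bp bins; the function name is hypothetical, and ties are broken by bin order, which the text does not specify.

```python
def rank_order_normalize(a, b):
    """Quantile-normalize two equal-length lists of per-bin read counts.

    Bins are ranked from highest to lowest in each data set; the bin at
    rank r in each data set receives the mean of the two rank-r values.
    """
    if len(a) != len(b):
        raise ValueError("both data sets must cover the same bins")
    # Bin indices ordered from highest to lowest signal (stable for ties).
    order_a = sorted(range(len(a)), key=lambda i: a[i], reverse=True)
    order_b = sorted(range(len(b)), key=lambda i: b[i], reverse=True)
    norm_a = [0.0] * len(a)
    norm_b = [0.0] * len(b)
    for idx_a, idx_b in zip(order_a, order_b):
        mean = (a[idx_a] + b[idx_b]) / 2
        norm_a[idx_a] = mean
        norm_b[idx_b] = mean
    return norm_a, norm_b
```

After normalization both data sets share the same distribution of values, so a change in a given bin between control and knockdown reflects a change in that bin's rank rather than a difference in sequencing depth.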

Enrichments for insulator-associated proteins at aligned 
dCTCF clusters were calculated as percentage of co-occurrence 
between dCTCF and BEAF-32, SU(HW), CP190, MOD(MDG4), 
MOD(MDG4)2.2, L(3)MBT, and Chromator at independent 
dCTCF sites, aligned dCTCF sites, and sites hosting dCTCF + BEAF-32 + 
SU(HW). Results were hierarchically clustered using Cluster 3.0 
(de Hoon et al. 2004) and visualized by Java Treeview (Saldanha 
2004). 
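
The percentage of co-occurrence used for these enrichments can be illustrated with a short sketch. Intervals are assumed to be half-open (chromosome, start, end) tuples and the function names are hypothetical; the overlaps themselves were computed with Galaxy tools, so this only shows the arithmetic.

```python
def overlaps(site, peak):
    """True if two half-open (chrom, start, end) intervals share at least 1 bp."""
    return site[0] == peak[0] and site[1] < peak[2] and peak[1] < site[2]


def percent_cooccurrence(sites, peaks):
    """Percentage of sites that overlap at least one peak of another factor."""
    if not sites:
        return 0.0
    hits = sum(any(overlaps(site, peak) for peak in peaks) for site in sites)
    return 100.0 * hits / len(sites)
```

Computing this value for each factor across the three site classes (independent, aligned, and dCTCF + BEAF-32 + SU(HW)) would give the matrix that is then clustered.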

Immunofluorescence microscopy 

Immunofluorescence microscopy of polytene chromosomes was 
done as previously described (Ivaldi et al. 2007). Cells were stained 
with primary antibodies in antibody dilution buffer (1× PBS, 0.1% 
Tween 20, 1% BSA) overnight at 4°C [1:100 rabbit α-MOD(MDG4)2.2 
(gift from Elissa Lei); 1:100 rat α-MOD(MDG4) (Pai et al. 2004)]. 

Real-time PCR analysis 

Real-time PCR analyses for H3K27me3 levels in insulator knock- 
down experiments and Re-ChIP were performed with independent 
ChIP samples. Fermentas Life Sciences Maxima qPCR SYBR Green 
ROX Mix (#K0222) was used and percent input was calculated with 
a three-point standard curve from the input sample. ChIP DNA 
and input DNA concentrations were calculated using a Qubit 2.0 
fluorometer HS assay (Invitrogen Q32851). ChIP DNA concentra- 
tions were consistently lower in insulator knockdown conditions, 
and thus normalized by equal ChIP/input DNA ratios before qRT-PCR. 
Primers used for both analyses are provided in Supplemental 
Table S3. 
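
Percent input from a three-point standard curve can be computed as sketched below. The dilution series (10%, 1%, and 0.1% of input) and the Ct values in the usage note are assumptions chosen for illustration, as the actual dilutions are not stated; the function names are hypothetical.

```python
import math


def fit_standard_curve(percents, cts):
    """Least-squares fit of Ct against log10(percent input); returns (slope, intercept)."""
    xs = [math.log10(p) for p in percents]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def percent_input(ct, slope, intercept):
    """Interpolate the percent input of a ChIP sample from its Ct value."""
    return 10 ** ((ct - intercept) / slope)
```

For example, fitting hypothetical input Ct values of 25.0, 28.3, and 31.6 at 10%, 1%, and 0.1% gives a slope close to -3.3 cycles per 10-fold dilution, and a ChIP sample with a Ct of 28.3 would then correspond to about 1% of input.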

Gene expression analyses 

RNAi knockdown in Drosophila Kc cell culture was conducted 
as per the Drosophila RNAi Screening Center (DRSC) protocol 
(Armknecht et al. 2005), with the exception that dsRNA was added 
every day for 3 d and the cells were then collected on the fourth 
day. Also, multiple amplicons targeting each gene for knockdown 
were used, with the exception of BEAF-32, which only used one. 
RNA was then isolated from the Kc cells using the Qiagen RNeasy 
kit (catalog #74104) with on-column DNA digestion (catalog 
#79254) following the manufacturer's protocols. cDNA synthesis 
was then performed using the Applied Biosystems High Capacity 
cDNA Reverse Transcription Kit (catalog #4368814). cDNA was 
hybridized to a NimbleGen D. melanogaster Gene Expression 
12X135K Array based on D. melanogaster annotation DM5.45 at 
the Florida State University-NimbleGen Microarray Facility. Ex- 
pression analysis of variance was performed using Partek software, 
version 6.5. A list of primers used for amplicon formation is 
provided in Supplemental Table S4. 

Data access 

Gene expression and ChIP-seq data have been submitted to the 
NCBI Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) 
under accession numbers GSE36944 [L(3)MBT and 
MOD(MDG4)2.2/MOD(MDG4) in Drosophila Kc cells], GSE37444 
(H3K27me3 in Drosophila Kc cells, control and dCTCF knockdown), 
and GSE36393 (Gene expression in Drosophila Kc cells 
before and after insulator knockdown). 